All Questions

0 votes
0 answers
52 views

YOLOv1 - Why do we predict multiple bounding boxes?

When you look at the YOLOv1 paper and corresponding implementations it is always mentioned that for every grid cell, we predict B bounding boxes (usually two). Then we use IoU to choose the ...
Lockhart's user avatar
0 votes
1 answer
153 views

Transfer learning using pretrained tensorflow object detection model [closed]

I am new to AI/ML and wanted to seek guidance as I am totally lost. I will simplify my issue as follows: Let's say I would like to detect apples and oranges in images. I would like to leverage a pre-...
Doug's user avatar
  • 125
-1 votes
1 answer
110 views

Why do we need Tensorflow, Keras and other ML/AI modules?

This question might seem stupid at first glance, and it might be - that is because I am very new here and I've tried to think about an answer of my own, and searched for this question, but found no answer.....
0Interest's user avatar
0 votes
0 answers
734 views

Object detection: when there's only 1 object in each image

Good day. I have a custom dataset for object detection, which is imbalanced in that each image has only one object annotation. I trained the object detection model (Efficientdet-dx) on TensorFlow object ...
Kim's user avatar
2 votes
1 answer
1k views

Tensorflow object detection model total loss starts out good, but suddenly explodes up to high loss numbers

I'm training a Tensorflow object detection model with approx. 7500 images of two classes, which contains approx. 10,000 classes per class. I'm using Tensorflow 2.6.0, in case that is relevant. I am ...
sneeze_shiny's user avatar
0 votes
1 answer
330 views

How to handle an unbalanced dataset when training object detection algorithms?

I am training an object detection model, and I have some very highly unbalanced data annotations. I have almost 11,000 images, all with dimensions of 1024 $\times$ 1024. Within those images I have the ...
sneeze_shiny's user avatar
0 votes
1 answer
2k views

Is it possible that the fine-tuned pre-trained model performs worse than the original pre-trained model?

I have downloaded a pre-trained EfficientDet D2 model (Tensorflow 2.0) and trained it on some data (about 20000 images with 20 classes). I set the number of steps to 25000 and batch size to 3 (...
Araw's user avatar
  • 103
4 votes
1 answer
3k views

What are the main differences between YOLOv3 and RetinaNet object detection algorithms?

I am looking at a certain project that compares performance on a certain dataset for an object detection problem using YOLOv3 and RetinaNet (or the "SSD_ResNet50_FPN" from TF Model Zoo). ...
Ananda's user avatar
0 votes
1 answer
362 views

Bounding Box Regression - An Adventure in Failure [closed]

I've solved many problems with neural networks, but rarely work with images. I have about 18 hours into creating a bounding box regression network and it continues to utterly fail. With some loss ...
David Hoelzer's user avatar
0 votes
1 answer
444 views

Is it possible to modify or replace the basic network of YOLO?

I have an idea to adapt the YOLO algorithm to my application. The original YOLO algorithm is for image classification and has 24 convolutional layers with 1000 output classes. Is it possible to ...
ColinGuolin's user avatar
1 vote
1 answer
50 views

Should I prefer cropped images or realistic images for object detection?

I am new to the field of AI but due to the high level of abstraction that comes with services such as Google VisionAI I got motivated to write an application that detects symbols in photos based on ...
Vale's user avatar
1 vote
0 answers
21 views

Irregular results when predicting identical objects in the same image

I used the pre-trained model faster_rcnn_resnet101_coco.config with my own dataset. I have two issues: some objects were not detected, even though I trained on them, with a high number of steps, and test over ...
Ahmed Salem's user avatar
1 vote
1 answer
140 views

Should I use single or double view for gender recognition?

My project requires gender recognition of people shown in the given images, with more than one person per image. However, these people can be positioned in frontal or side view (passing by ...
GKozinski's user avatar
2 votes
1 answer
68 views

Why does the TF Object Detection API need so few pictures?

I am wondering why the TF Object Detection API needs so few picture samples for training while regular CNNs need many more. What I read in tutorials is that the TF Object Detection API needs around 100-500 ...
GKozinski's user avatar
3 votes
0 answers
30 views

How to voxelize multiple frames at a time and append them together?

I'm trying to implement this approach for object detection and tracking. In this approach, the first step is to voxelize each frame to construct a 3D tensor, the second step is to append multiple voxels ...
OneManArmy's user avatar
